Remove HMP from optimum-habana #349
Conversation
@jwieczorekhabana Thanks for opening this PR! I'll review it at the beginning of next week and try to speed up merging the autocast PRs.
@jwieczorekhabana I left several comments, the main point being that #308 enables users to specify their own op lists in Gaudi config files, so this should be documented, tested, etc. (which should basically consist of changing a few variable names).
For OPT, I need to check if we need to disable autocast for this specific line.
Note that we should not merge this PR before we manage to make Wav2Vec2 work with autocast. It is being investigated but no fix has been merged yet.
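For reference, a Gaudi config file with user-specified op lists would presumably look something like the sketch below; the `autocast_bf16_ops`/`autocast_fp32_ops` key names follow the attributes proposed later in this thread, and the op lists themselves are purely illustrative:
```
{
    "use_fused_adam": true,
    "use_fused_clip_norm": true,
    "use_torch_autocast": true,
    "autocast_bf16_ops": ["add", "addmm", "bmm", "matmul", "mm", "linear"],
    "autocast_fp32_ops": ["embedding", "nll_loss", "log_softmax"]
}
```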
```
elif self.gaudi_config.use_torch_autocast:
    # Open temporary files to write mixed-precision ops
    with tempfile.NamedTemporaryFile() as hmp_bf16_file:
        with tempfile.NamedTemporaryFile() as hmp_fp32_file:
            self.gaudi_config.write_bf16_fp32_ops_to_text_files(
                hmp_bf16_file.name,
                hmp_fp32_file.name,
            )
            # Point autocast at the generated op-list files
            # (note: the file path via .name, not the file object)
            os.environ["LOWER_LIST"] = str(hmp_bf16_file.name)
            os.environ["FP32_LIST"] = str(hmp_fp32_file.name)
```
We should keep this block as it enables specifying custom bf16/fp32 ops for autocast.
Actually we still need this for GPT2 with autocast, see #308 that I just merged.
Can you rebase your branch and replace the "HMP" mention in the docstring please?
```
def write_bf16_fp32_ops_to_text_files(
    self,
    path_to_bf16_file: Path,
    path_to_fp32_file: Path,
):
    for path, ops in zip(
        [Path(path_to_bf16_file), Path(path_to_fp32_file)], [self.hmp_bf16_ops, self.hmp_fp32_ops]
    ):
        with path.open("w") as text_file:
            # writelines does not add new lines after each element so "\n" is inserted
            text_file.writelines(op + "\n" for op in ops)
```
We still need this method to be able to specify autocast custom op lists in a Gaudi config
tests/test_gaudi_configuration.py (outdated)
```
def test_write_bf16_fp32_ops_to_text_files(self):
    gaudi_config = GaudiConfig()

    with tempfile.NamedTemporaryFile() as bf16_file:
        with tempfile.NamedTemporaryFile() as fp32_file:
            gaudi_config.write_bf16_fp32_ops_to_text_files(
                bf16_file.name,
                fp32_file.name,
            )

            self.assertTrue(
                filecmp.cmp(
                    bf16_file.name,
                    BF16_OPS_REFERENCE_FILE,
                    shallow=False,
                )
            )
            self.assertTrue(
                filecmp.cmp(
                    fp32_file.name,
                    FP32_OPS_REFERENCE_FILE,
                    shallow=False,
                )
            )
```
We should keep this test as `write_bf16_fp32_ops_to_text_files` will still be used to specify custom op lists. I propose to instantiate a Gaudi config such as:
```
gaudi_config = GaudiConfig(
    autocast_bf16_ops=[
        "add",
        "addmm",
        "bmm",
        "div",
        "dropout",
        "gelu",
        "iadd",
        "linear",
        "layer_norm",
        "matmul",
        "mm",
        "rsub",
        "softmax",
        "truediv",
    ],
    autocast_fp32_ops=[
        "embedding",
        "nll_loss",
        "log_softmax",
    ],
)
```
- `hmp_bf16_ops` specifies the Torch operations that should be computed in *bf16*. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).
- `hmp_fp32_ops` specifies the Torch operations that should be computed in *fp32*. You can find more information about casting rules [here](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/PT_Mixed_Precision.html#basic-design-rules).
I think we should mention `use_torch_autocast`, but say that `--bf16` should be favored since `use_torch_autocast` is used to define a good pre-defined config; and mention `autocast_bf16_ops` and `autocast_fp32_ops`, since "Add support for autocast custom ops in `GaudiTrainer`" (#308) enables users to specify custom op lists, but say that the defaults should work for most models.
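For illustration, favoring `--bf16` would amount to something like this minimal sketch (the output directory and Gaudi config name are placeholders; `bf16=True` is the standard 🤗 Transformers training argument):
```
from optimum.habana import GaudiTrainingArguments

# Minimal sketch: enable bf16 mixed precision through the training arguments
# (the --bf16 flag) instead of use_torch_autocast in the Gaudi config.
training_args = GaudiTrainingArguments(
    output_dir="./out",  # placeholder
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/bert-large-uncased-whole-word-masking",  # placeholder
    bf16=True,
)
```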
As discussed by email, regarding `autocast_bf16_ops` and `autocast_fp32_ops`, I'm fine with saying that the env variable way should be favored. But they should still be documented.
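For context, the env-variable way amounts to writing one op name per line to text files and pointing Habana autocast at them before the HPU device is initialized; a rough standalone sketch (file names and op lists are illustrative, `LOWER_LIST`/`FP32_LIST` are the variables used in the diff above):
```
import os

# Write one op name per line; these files feed Habana's autocast op lists.
with open("bf16_ops.txt", "w") as f:
    f.write("\n".join(["add", "addmm", "mm", "matmul", "linear"]) + "\n")
with open("fp32_ops.txt", "w") as f:
    f.write("\n".join(["embedding", "nll_loss", "log_softmax"]) + "\n")

# Must be set before the HPU device is initialized.
os.environ["LOWER_LIST"] = "bf16_ops.txt"  # ops to run in bf16
os.environ["FP32_LIST"] = "fp32_ops.txt"   # ops to keep in fp32
```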
```
@@ -57,44 +57,16 @@ To not take them into account in the computation of the throughput at the end of
## Mixed-Precision Training

Mixed-precision training enables to compute some operations using lighter data types to accelerate training.
Habana Mixed Precision (HMP) proposes to mix *fp32* and *bf16* operations.
Optimum-Habana enables mixed precision training in a similar fasion as HuggingFace transofrmers.
```
Suggested change:
```
- Optimum-Habana enables mixed precision training in a similar fasion as HuggingFace transofrmers.
+ Optimum Habana enables mixed precision training in a similar fashion as 🤗 Transformers.
```
<Tip warning={true}>

Please refer to the [list of supported PyTorch operators](https://docs.habana.ai/en/latest/PyTorch/Pytorch_Operators/Pytorch_Operators.html) beforehand to make sure the ones you are interested in are compatible with *bf16*.

</Tip>
I would keep this
But those operators are incompatible with autocast. HMP and autocast operate on different software levels. Please see: https://docs.habana.ai/en/latest/PyTorch/PyTorch_Mixed_Precision/Autocast.html#override-options
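(For reference, with autocast the mixed-precision decision goes through PyTorch's autocast context rather than HMP's hooks; a minimal usage sketch on HPU, where `model` and `inputs` are placeholders:)
```
import torch

# Minimal sketch: bf16 mixed precision on Gaudi via torch.autocast,
# using the "hpu" device type (model and inputs are placeholders).
with torch.autocast(device_type="hpu", dtype=torch.bfloat16):
    outputs = model(**inputs)
```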
No problem, we don't have to keep the same operators. Maybe it will just be easier to refer to GPT2's Gaudi config.
Then, you can specify which operators to compute in *bf16* with `"hmp_bf16_ops"` and which operators to compute in *fp32* with `"hmp_fp32_ops"`.
If these operators are not specified, their default values are set to be the ones written in the [Gaudi configuration file of BERT](https://huggingface.co/Habana/bert-large-uncased-whole-word-masking/blob/main/gaudi_config.json), which is a good starting point for applying HMP:
```
"hmp_bf16_ops": [
    "add",
    "addmm",
    "bmm",
    "div",
    "dropout",
    "gelu",
    "iadd",
    "linear",
    "layer_norm",
    "matmul",
    "mm",
    "rsub",
    "softmax",
    "truediv"
],
"hmp_fp32_ops": [
    "embedding",
    "nll_loss",
    "log_softmax"
]
```
I would still keep a part of this to show how to specify custom op lists. We can add a link to the GPT2 Gaudi config when it is updated.
But shouldn't users provide custom lists in a similar way to other training demos outside of HuggingFace? We can keep those in GaudiConfig to make sure they are optimized for a specific model.
IMO users should be able to do both because those already used to Optimum Habana probably have Gaudi configs with custom op lists, so switching to Autocast will be easy and they won't be confused.
Force-pushed from cba7f71 to 184acc3.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for updating the PR! I left a few comments.
Could you also run the following to make the code style check pass please?
```
pip install black ruff
make style
```
LGTM!
Let's wait to see what happens with Wav2Vec2 training before merging anything, cc @libinta
Force-pushed from d668dfb to fdb4d92.
@jwieczorekhabana will the Gaudi config for Wav2Vec2 be the following?
```
{
    "use_fused_adam": true,
    "use_fused_clip_norm": true,
    "use_torch_autocast": true
}
```
Or will there be any specific bf16 ops?
That'll be it. There is a pending PR on HuggingFace (https://huggingface.co/Habana/wav2vec2/discussions/2/files) that can be merged after the 1.13 release. Release 1.13 addresses issues that both of those topologies had.
Cool, I'll merge them when 1.13 is released then 👍
@jwieczorekhabana Can you rebase this branch on main? There are merge conflicts in …
HMP is deprecated in favor of PyTorch autocast:
- removed HMP usage
- removed setting autocast env variables through GaudiConfig
- updated tests
- updated docs
- updated README.md files
- restored test_gaudi_config imports
Co-authored-by: regisss <[email protected]>
Force-pushed from 1f164dd to cc84d33.
@regisss Done
@jwieczorekhabana It seems there are still some merge conflicts in …
Change-Id: Ibbd216b9b6b5c9104b7dce3021f231eda3748704
@jwieczorekhabana I pushed a few commits to fix the CI checks that were not passing.
HMP is deprecated in favor of PyTorch autocast